ABSTRACT

Yin Yang 1 (YY1) is a multifunctional protein with regulatory
potential in tumorigenesis. Numerous studies have demonstrated the
activities of YY1 in regulating gene expression and mediating
differential protein modifications. However, the mechanisms
underlying YY1 gene expression remain relatively understudied.
G-quadruplexes (G4s) are four-stranded structures or motifs formed by
guanine-rich DNA or RNA domains. The presence of G4 structures in a
gene promoter or the 5'-UTR of its mRNA can markedly affect its
expression. In this report, we provide strong evidence for the
presence of G4 structures in the promoter and the 5'-UTR of YY1. In
reporter assays, mutations in these G4 structure-forming sequences
increased the expression of Gaussia luciferase (Gluc) downstream of
either the YY1 promoter or the 5'-UTR. We also discovered that G4
Resolvase 1 (G4R1) enhanced the Gluc expression mediated by the YY1
promoter, but not by the YY1 5'-UTR. Consistently, G4R1 binds the G4
motif of the YY1 promoter in vitro, and ectopically expressed G4R1
increased endogenous YY1 levels. In addition, analysis of gene array
data from the breast cancer samples of 258 patients indicates a
significant, positive correlation between G4R1 and YY1 expression.



INTRODUCTION 

Yin Yang 1 (YY1) is a multifunctional protein that is essential to
differential epigenetic regulation of gene expression and protein
modifications. As a ubiquitously expressed protein, YY1 acts as a
transcription factor that either activates or represses its target
genes, depending on the cofactors it recruits (1-3). Structural and
functional studies indicate that YY1 uses distinct domains to bind to
target promoters and recruit transcriptional cofactors to modulate
gene expression (4). Many YY1-interacting proteins regulate protein
modifications. Therefore, YY1 potentially mediates the modifications
of a variety of histone and non-histone proteins to determine the
expression status of its target genes. Consistently, YY1 has been
reported to regulate many genes whose products play essential roles
in cell proliferation and differentiation [reviewed in (2-5)].

Many lines of evidence suggest a regulatory role of YY1 in cancer
development. YY1 is one of the polycomb group (PcG) proteins that are
essential contributors to the aberrant epigenetics in cancers (6). At
the transcriptional level, YY1 regulates the expression of many
cancer-related genes, such as c-Myc, c-fos, erbb2, p53, Rb and cdc6
(7-12), as well as histones H3 and H4 (13,14). During apoptosis, YY1
colocalizes with p53, binds to a subset of p53 DNA-target sites and
regulates p53-dependent transcription (15). At the post-translational
level, YY1 associates with many proteins with critical regulatory
functions, such as p300, HDAC1,2,3, Ezh1, Ezh2, PRMT1, p53, Mdm2,
p14ARF, Rb and mTOR (16-25). YY1 is essential to the histone
methylation mediated by Ezh2 (on H1-K26 and H3-K27) (20,26) and PRMT1
(on H4-R3) (21). We and others demonstrated that YY1 negatively
regulates p53 by enhancing Mdm2-mediated p53 ubiquitination and
degradation (22,27-29). YY1 also blocks p300-mediated p53 acetylation
(27) and inhibits p53-activated transcription (15). Indeed, blocking
p53 acetylation can both decrease its transcriptional activity
(30-32) and facilitate Mdm2-mediated p53 ubiquitination and
degradation (33). Therefore, the activities of YY1 in enhancing p53
ubiquitination and inhibiting p53 acetylation converge on the same
consequence: antagonizing p53. These multiple functions and unique
properties endow YY1 with a pivotal role in epigenetic regulation,
including genomic imprinting and chromatin remodeling. Consistently,
YY1 is highly expressed in human breast cancer (9), prostate
carcinoma (34), acute myeloid leukemia (35), osteosarcoma (36,37) and
cervical cancer (38).

Few studies have investigated how YY1 itself is regulated. YY1 gene
expression can be stimulated by different growth stimuli, including
insulin-like growth factor-1, fibroblast growth factor-2 and
morphine. Factors that inhibit YY1 expression include prohibitin,
microRNA-29, DETANONOate (a nitric oxide donor) and naloxone [see
review (3)]. YY1 may also regulate its own expression by binding to
intron 1 (39).

G-quadruplex (G4) is a four-stranded secondary struc- 
ture of DNA or RNA stabilized by Hoogsteen hydrogen 
bonding of guanine quartets and the stacking of these 
planar quartets. A genome-wide survey of the evolution- 
ary conservation of DNA motifs indicated that G4 DNA 
motifs are significantly conserved (40). Increasing evidence 
suggests an important role of G4 DNA structures in 
regulating gene expression (41). Interestingly, G4 DNA 
structures are more enriched in promoters than other 
regions of genomic DNA, especially in genes involved in 
development, survival and proliferation. 

Recent studies implicate a role for G4 DNA in tumorigenesis. The
telomeric G-rich DNA overhang forms G4 structures that inhibit the
activity of telomerase, which is required for the immortalization of
most cancer cells (42). The majority of proto-oncogenes possess
G-rich promoters, while promoters of tumor suppressors are depleted
of closely linked G-runs relative to the genomic average (43).
Consistently, G4 DNA structures have been demonstrated to regulate
the expression of a number of well-characterized oncogenes, such as
c-Myc, K-RAS, Bcl-2 and hTERT. Indeed, the six critical cellular and
microenvironmental processes that are aberrantly regulated in
oncogenic transformation, as summarized by Hanahan and Weinberg (44),
are all modulated by genes that are regulated by G4 structures (40).

Several DNA and RNA helicases with the catalytic activity to unwind
or resolve G4 DNA or G4 RNA structures have been identified,
including BLM, WRN, FANCJ, G4R1 (or G4R1/RHAU, DHX36), RNA helicase
II (DHX9) and SV40 large T-antigen (45-47). These G4 DNA or RNA
resolvases can unwind the four-stranded G4 structures to a
single-stranded form in an ATP-dependent fashion (48,49). Among them,
G4R1 possesses tetramolecular and intramolecular quadruplex G4 DNA
and G4 RNA resolving activity (50-52). Importantly, G4R1 has been
observed to be increasingly expressed in cancers (unpublished data
from Akman group), suggesting a regulatory role in promoting the
expression of oncogenes.

Based on previous studies of YY1 in different cancer-related
processes, we concluded that YY1 likely plays an oncogenic or
proliferative role in tumorigenesis (3). A previous study indicated
that the promoter of mouse YY1 is G/C rich (53). Therefore, we
analyzed the primary sequences upstream of the human YY1 coding
region, and observed the presence of multiple G- or C-rich strings.
In this report, we demonstrate that G4 structures are formed by
oligonucleotides whose sequences are found in the promoter and
5'-untranslated region (UTR) of YY1. The alteration of these
sequences affects the expression of a reporter gene. G4R1 promotes
the expression driven by the YY1 promoter, but not that mediated by
the YY1 5'-UTR. Consistently, ectopic G4R1 increases endogenous YY1
levels, and G4R1 expression positively correlates with YY1 expression
in the samples from 258 breast cancer patients. Overall, our data
reveal the presence of G4 structures in the YY1 promoter and 5'-UTR,
and suggest that G4R1 may modulate YY1 expression by resolving the G4
DNA structure in the YY1 promoter.

MATERIALS AND METHODS 

Cell culture and transient transfection 

HeLa and 293T cells were cultured in Dulbecco's modified Eagle's
medium containing 10% fetal bovine serum. Lipofectamine 2000 was used
for transient transfection. The HeLa cells expressing
doxycycline-inducible G4R1 shRNA were reported previously (54);
induction was achieved by adding doxycycline to the medium at a final
concentration of 1.5 µg/ml.

Antibodies and oligonucleotides 

YY1 antibody (H-10), histone H3 antibody (FL-136) and non-specific
mouse IgG (sc-2343) were purchased from Santa Cruz Biotechnology.
Mouse monoclonal G4R1/RHAU antibody (12F33) was generated against a
peptide corresponding to amino acids 991-1007 of G4R1, as previously
described (55). β-Actin antibody was from Chemicon International Inc.
DNA oligonucleotides or primers were synthesized by Bioneer Inc. and
Integrated DNA Technologies (IDT), while the RNA oligonucleotides
were made by the Biomolecular Resource Laboratory, Wake Forest
University School of Medicine. The annealing conditions for G4
structure formation followed a published procedure (52) with
modifications (Supplementary Figure S1).

DNA vector construction 

To generate reporter constructs driven by the YY1 promoter, the -1703
to -1 region of the YY1 promoter (with the transcription start site
designated as +1; this applies to all following descriptions) was
amplified by PCR using AccuPrime Pfx DNA polymerase (Invitrogen) with
primers F1: 'cctg gaattc att ggt gtt tat ggg gaa gta tca' and
R1: 'ctag tctaga ctc gat tct cct ctc ggc caa tc' (F: forward;
R: reverse). The template was the clone RP11-459E8 containing the YY1
gene, purchased from the BACPAC Resources Center (Oakland, CA, USA).
The PCR fragment was digested with EcoRI and XbaI (underlined in the
primer sequences) and inserted into EcoRI- and NheI-digested pGluc
Basic (New England BioLabs Inc.), of which the multiple cloning site
had been modified. To mutate the G4 structure-forming sequence
between -409 and -347 in the negative strand of the YY1 promoter, we
synthesized two primers, F2: 'ccgtg gcatgc gcc tca acc tcg ctc ccg
gcc ggc ccc tc' and R2: 'ccgtg gcatgc gcg gcc cgg ggg ccg cgc ggg
gag', each containing a created SphI site (underlined) that
facilitated the DNA subcloning and altered the guanines essential to
the predicted G4 structure formation. They were used with the first
two primers (F1 with R2; F2 with R1) to amplify the YY1 promoter as
two PCR fragments, which were then digested with the corresponding
restriction enzymes and subcloned simultaneously into the modified
pGluc Basic vector. As a result, a mutated form of the YY1 promoter
containing an altered G4 structure-forming sequence was generated.

To study the contribution of the G4 structure-forming sequence in the
YY1 5'-UTR to Gluc expression, we amplified the YY1 5'-UTR using
primers F3: 'cacg acgcgt agg gcg aac ggg cga gtg gca g' and R3: 'cgag
ggatcc ggc tga ggg ctc cgc cgc cac g' with the RP11-459E8 plasmid as
a template. After digestion with MluI and BamHI (underlined), the
fragment was inserted between the PGK promoter and the Gluc cDNA of a
reporter construct. To mutate the G4 structure-forming sequence in
the YY1 5'-UTR, we synthesized two additional primers, F4: 'cgga
ggtacc cgg gga agc ccc gcc gcc gcc' and R4: 'cgga ggtacc tcg cct cgg
tgc gcc cgc gcc', each containing a created KpnI site to facilitate
the subcloning and mutate the guanines essential to the predicted G4
structure formation. The PCR reactions using primers F3 with R4 and
F4 with R3 amplified the YY1 5'-UTR as two fragments, which were then
simultaneously subcloned into the pPGK-Gluc vector. The sequences of
all wild-type and mutated reporter constructs described here were
confirmed by DNA sequencing.

Circular dichroism study 

To anneal G-quadruplexes, 20 µl of 100 pmol/µl DNA or RNA
oligonucleotides was mixed with 180 µl of TE buffer (10 mM Tris-HCl,
0.1 mM EDTA, pH 7.5) and annealed as described in Supplementary
Figure S1. These annealed oligonucleotides were diluted to 4 µM in TE
buffer supplemented with 50 mM KCl. Circular dichroism (CD) spectra
were recorded on a spectropolarimeter (Aviv Model 202 CD
Spectrometer, equipped with a thermoelectrically controlled cell
holder) using a quartz cell of 0.5 mm optical path length, over a
wavelength range from 200 to 350 nm at 25°C. For melting temperature
scans, we used a temperature range from 20°C to 95°C at a constant
wavelength of 262 nm. The CD spectra are presented after subtraction
of the signal contributed by the buffer.
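
The concentrations in the annealing and dilution steps follow from simple arithmetic, sketched below. The function names are hypothetical and introduced only for illustration; the volumes and concentrations are those stated in the text (20 µl of 100 pmol/µl stock, 180 µl of TE buffer, 4 µM working concentration).

```python
def mixed_concentration_uM(stock_pmol_per_ul, stock_volume_ul, buffer_volume_ul):
    """Oligonucleotide concentration after mixing stock with buffer.

    1 pmol/µl is numerically equal to 1 µM, so no unit conversion is needed.
    """
    total_volume_ul = stock_volume_ul + buffer_volume_ul
    return stock_pmol_per_ul * stock_volume_ul / total_volume_ul


def fold_dilution(current_uM, target_uM):
    """Fold dilution needed to go from the current to the target concentration."""
    return current_uM / target_uM


# Values taken from the annealing and CD protocol described above.
annealed_uM = mixed_concentration_uM(100.0, 20.0, 180.0)
dilution = fold_dilution(annealed_uM, 4.0)
print(f"annealed stock: {annealed_uM} µM, dilute {dilution}-fold to reach 4 µM")
```

Under these assumptions the annealing mixture is at 10 µM, so a 2.5-fold dilution gives the 4 µM sample used for the CD measurements.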

5'-³²P-end labeling of G4 nucleic acids

To produce a 5'-³²P-labeled G4 oligonucleotide, an aliquot (<1/10 of
the final volume for the labeling reaction) of the annealed G4
oligonucleotide was incubated with T4 polynucleotide kinase (Promega
Corp.) and γ-³²P-ATP for 30 min at 37°C, according to the
manufacturer's instructions. The 5'-³²P-labeled G4 oligonucleotides
were purified with a MicroSpin G25 column (GE Healthcare)
equilibrated with TEK buffer (10 mM Tris, 1 mM EDTA and 50 mM KCl)
and stored at -20°C.

Dimethyl sulfate footprinting 

Dimethyl sulfate (DMS) footprinting was carried out following a
modified version of previously published protocols (56,57). A
purified 5'-³²P-labeled oligonucleotide was annealed in the absence
and presence of 100 mM KCl or 100 mM LiCl, and 1 µg/µl of sonicated
salmon sperm DNA. DMS (Sigma) dissolved in ethanol (DMS:ethanol, 4/1,
v/v) was added to the oligonucleotide solution (0.7 µl to a total
volume of 49 µl) and incubated at room temperature for 3 min. The
reaction was stopped by adding two volumes of the stop solution
(1.5 M sodium acetate, pH 7.0, 1.0 M β-mercaptoethanol and
0.5 ng/µl tRNA). The DNA was precipitated with four volumes of
ethanol and resuspended in 1.0 M piperidine (Sigma). After cleavage
at 95°C for 30 min, the DNA was precipitated by adding 20 µg of
glycogen (Invitrogen), one-ninth volume of 3 M sodium acetate
(pH 5.2) and two volumes of ethanol. For the Maxam-Gilbert chemical
G+A sequencing reaction, an oligonucleotide was treated with formic
acid and piperidine following a standard protocol (58). The samples
were resuspended in 90% formamide and 20 mM EDTA, denatured at 95°C
for 3 min and run for 2-3 h on an 18% denaturing polyacrylamide gel
(Bio-Rad) in 1x TBE and 8.0 M urea. After electrophoresis, the gel
was fixed in a solution containing 50% methanol and 10% polyethylene
glycol 400, and dried at 80°C for 3 h, followed by autoradiography.

In vitro DNase I footprinting of the YY1 promoter regions

As previously demonstrated by Sun et al. (59,60), the G4 structures
and i-motifs formed by G-rich and C-rich regions, respectively, are
resistant to DNase I cleavage. The experiments followed the procedure
described by Sun (60). We first subcloned a fragment (-1180 to -329)
of the YY1 promoter into the pGL3-Basic vector (Promega) between the
HindIII and XhoI sites. The resulting vector, pGL3/YY1-short-prmt,
included the G4 structure-forming YP-3 sequence (-409 to -347). This
plasmid (2 µg in 25 µl) was incubated at 37°C overnight in 50 mM
Tris-HCl, pH 7.6, in the absence or presence of 100 mM KCl. The
sample was then mixed with 2 µl of 0.1 U/µl DNase I and incubated at
ambient temperature for 2 min, immediately followed by DNA
precipitation and a primer extension reaction using ³²P-labeled
primer P1 (CTT TCT TTA TGT TTT TGG CGT CTT), located downstream of
the inserted YY1 promoter fragment, and Thermo Sequenase (Affymetrix
Inc.). Meanwhile, the G-rich negative strand of the YY1 promoter
fragment in the untreated plasmid was sequenced with the same primer
using the Thermo Sequenase Cycle Sequencing Kit (Cat# 78500,
Affymetrix Inc.) following the procedure provided by the
manufacturer. These samples were resolved on a 6% denaturing
polyacrylamide gel (Bio-Rad) in 1x TBE and 8.0 M urea at a constant
55 W for 3 h. After electrophoresis, the gel was dried at 80°C for
2 h, followed by autoradiography.

Electrophoretic mobility shift assays 

Recombinant G4R1, purified as described previously (50), was
incubated at concentrations of 10-120 pM with 1 pM of 5'-³²P-labeled
G4 nucleic acid in RES-EDTA buffer (100 mM KCl, 10 mM NaCl, 3 mM
MgCl2, 50 mM Tris-acetate, pH 7.8, 70 mM glycine, 0.012% bovine
α-lactalbumin, 10% glycerol, 10 mM EDTA) at 37°C for 30 min. Binding
mixtures were then analyzed on a 10% non-denaturing polyacrylamide
gel. Electrophoresis was performed at 70 V for 10 h in a cold room.
Gels were imaged on a Typhoon 9210 Imager (GE Healthcare). The
experiments determining the effect of ATP on G4R1/YP-3 association
were carried out as previously described (61). Self-annealed
5'-³²P-labeled YP-3 (1 pM) was incubated with different amounts (25,
75 and 300 pM) of G4R1 in the presence and absence of 5 mM ATP at
37°C for 30 min. The samples were analyzed on a 10% non-denaturing
polyacrylamide gel at 55 V for 18 h, followed by the same imaging
procedure described above.

Reporter assay 

293T cells cultured in 24-well plates were transfected with 200 ng of
the reporter constructs containing the YY1 promoter, the 5'-UTR or
their mutant forms with altered potential G4 structure-forming
sequences, and 2 ng of a control plasmid, pCMV-SEAP (secreted
alkaline phosphatase). To detect the effect of G4R1 on the YY1
promoter or 5'-UTR, 500 ng of G4R1 expression plasmid or empty vector
was cotransfected with 200 ng of reporter plasmid and 2 ng of
pCMV-SEAP plasmid. Aliquots of medium from the transfected wells were
collected 48 h post-transfection to measure Gaussia luciferase (Gluc)
activity, which was then normalized against the SEAP activity in the
same sample, according to the procedure we described previously (62).
Each condition was tested in triplicate and repeated more than three
times.
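
The normalization described above (Gluc activity divided by SEAP activity from the same well, summarized across triplicates) can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names are hypothetical and the readings are invented placeholder numbers, not data from this study.

```python
from statistics import mean, stdev


def normalize_to_seap(gluc_readings, seap_readings):
    """Divide each Gluc reading by the SEAP reading from the same well."""
    if len(gluc_readings) != len(seap_readings):
        raise ValueError("each Gluc reading needs a matching SEAP reading")
    return [gluc / seap for gluc, seap in zip(gluc_readings, seap_readings)]


def summarize(ratios):
    """Return (mean, sample SD) of the normalized replicates."""
    return mean(ratios), stdev(ratios)


# Invented triplicate readings (arbitrary units), for illustration only.
wild_type = normalize_to_seap([1000.0, 1100.0, 900.0], [100.0, 100.0, 100.0])
mutant = normalize_to_seap([2000.0, 2200.0, 1800.0], [100.0, 100.0, 100.0])

wt_mean, wt_sd = summarize(wild_type)
mut_mean, mut_sd = summarize(mutant)
fold_change = mut_mean / wt_mean  # mutant relative to wild type
print(f"wild type {wt_mean:.1f} ± {wt_sd:.1f}; mutant {mut_mean:.1f} ± {mut_sd:.1f}")
print(f"fold change: {fold_change:.2f}")
```

Normalizing per well before averaging keeps each replicate's transfection-efficiency correction tied to its own SEAP signal.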

Chromatin immunoprecipitation assay 

Chromatin immunoprecipitation (ChIP) assays were performed as
previously reported (63). Samples immunoprecipitated by a control
IgG, G4R1 antibody and histone H3 antibody were analyzed by real-time
PCR using the FastStart Universal SYBR Green Master (Roche
Diagnostics GmbH) and the primers F5: 'ccc gaa gcc agg cga caa gaa c'
and R5: 'gtg caa cag cca caa aac ccg'. F5 and R5 are located upstream
(-525) and downstream (-208), respectively, of the potential
G4R1-binding site in the YY1 promoter. As a control, we also designed
two primers, F6: 'atg cta agg cca aaa aca acc agt g' and R6: 'tga aac
gag att aca gag caa gat a', that are located in YY1 exon 5 (+1700 and
+1948, respectively) and amplify a fragment with relatively low G/C
content (34.6% of G/C; 17.3% of each).

Microarray analysis of YY1 and G4R1 expression profiles

Data for the Uppsala breast cancer cohort, comprising tumor samples
from 258 breast cancer patients (64) profiled on the Affymetrix U133A
and U133B GeneChips, were accessed via the caArray website
(https://array.nci.nih.gov/caarray/project/details.action?project.experiment.publicIdentifier=mille-00271),
accession id: mille-00271. The microarray data were MAS5.0 normalized
by scaling the mean of each array to a target signal intensity of 500
and log (base 2) transformed. Multiple correlated probe sets
corresponding to YY1 and G4R1 (or DHX36) were identified. Three YY1
probe sets (U133A: 213494_s_at, 201901_s_at and 200047_s_at) and
three G4R1 probe sets (U133B: 223138_s_at, 223139_s_at and
223140_s_at) were averaged to represent the expression profiles of
YY1 and G4R1, respectively. The correlation of YY1 to G4R1 expression
was evaluated by Pearson correlation using SigmaPlot 11.0 software.
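
The analysis steps above (scale each array to a target intensity, log2-transform, average the probe sets per gene, then compute a Pearson correlation) can be sketched in plain Python. This is an illustrative sketch, not the SigmaPlot or MAS5.0 procedure itself: the function names are hypothetical, the scaling uses a plain arithmetic mean as a simplification, and the numbers are invented.

```python
from math import log2, sqrt


def scale_and_log2(intensities, target=500.0):
    """Scale one array so its mean equals `target`, then log2-transform.

    Simplification: a plain mean is used here as a stand-in for the
    MAS5.0 scaling step.
    """
    factor = target / (sum(intensities) / len(intensities))
    return [log2(value * factor) for value in intensities]


def average_profiles(profiles):
    """Average several probe-set profiles sample by sample."""
    return [sum(values) / len(values) for values in zip(*profiles)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented log2 expression values for three samples and three probe sets each.
yy1_probes = [[8.0, 9.0, 10.0], [8.2, 9.2, 10.2], [7.8, 8.8, 9.8]]
g4r1_probes = [[6.0, 6.5, 7.0], [6.2, 6.7, 7.2], [5.8, 6.3, 6.8]]

yy1_profile = average_profiles(yy1_probes)
g4r1_profile = average_profiles(g4r1_probes)
r = pearson_r(yy1_profile, g4r1_profile)
print(f"Pearson r = {r:.3f}")
```

Averaging the probe sets before correlating gives one profile per gene, so the coefficient reflects gene-level rather than probe-level agreement.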

Statistical analysis 

All data in reporter assays and qPCR are presented as 
mean ± SD. Comparisons between two groups on a 
single parameter were conducted using Student's t-test. 
Statistical analyses were performed using KaleidaGraph. 
The criterion for statistical significance was set at P < 0.05.
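
For readers unfamiliar with the test, a two-sample Student's t statistic with pooled variance can be computed as sketched below. This is a minimal illustration rather than the KaleidaGraph procedure: the function name is hypothetical, the replicate values are invented, and significance is judged against tabulated two-tailed critical values at the 0.05 level instead of an exact P-value.

```python
from math import sqrt
from statistics import mean, variance

# Standard two-tailed critical t values at the 0.05 level,
# keyed by degrees of freedom.
T_CRITICAL_0_05 = {2: 4.303, 4: 2.776, 6: 2.447}


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) using pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df


# Invented triplicate values for two conditions, for illustration only.
t_stat, df = student_t([20.0, 22.0, 18.0], [10.0, 11.0, 9.0])
significant = abs(t_stat) > T_CRITICAL_0_05[df]
print(f"t = {t_stat:.3f}, df = {df}, significant at 0.05: {significant}")
```

With triplicates in each group the test has four degrees of freedom, so the observed |t| must exceed 2.776 to meet the 0.05 criterion.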

RESULTS 

The YY1 promoter and 5'-UTR are highly G/C-rich and contain potential
G4 DNA and G4 RNA structure-forming sequences, respectively

For a specific gene, a G4 DNA structure can be formed by either the
positive or the negative strand of its promoter, while a G4 RNA
structure in the 5'-UTR will only be present in the mRNA coded by the
positive strand. Thus, the G/C content in a promoter and the G
content in a 5'-UTR are essential determinants of the formation of G4
DNA and G4 RNA motifs, respectively. A previous study by Seto's group
demonstrated that the promoter region within 1500 bp upstream of the
human YY1 transcription start site did not show a marked difference
in reporter assays compared to the fragment up to -3600 bp (65).
Therefore, we first analyzed the G/C contents in different promoter
regions from the YY1 transcription start site (designated as +1) up
to 1500 bp upstream (-1500), and in the 5'-UTR of the YY1 mRNA. As
shown in Figure 1A, the G/C contents of the YY1 promoter increase
monotonically as the analyzed region gets closer to the transcription
start site. Remarkably, the G/C

[Figure 1A: G/C contents in the YY1 promoter and 5'-UTR. Schematic of
the promoter (-1,500, -1,000 and -500 to -1) and the 5'-UTR (+1 to
+480). Percentages legible in the extraction: 56.9%, 67.3%, 76.8%,
78.4% and 80.0% (G alone: 44.4%).]

[Figure 1B: sequence of the YY1 promoter and 5'-UTR, with candidate G4
structure-forming elements labeled above the sequence.]

-1,000 

TCCGTGCCACAAAAAAAAATCTAGGCTCTGTTGCAGGTACAATGGAGGACACGGCTGAAAAAATTTGGAATTTTAARTGAGACARA 

YP-1: -866 (Pos) 

TGCAAAACCTGGTGGGCGTAAAAAGGAGCACCTATGAAAGTGACAAATAGGGGGAAAGGGTGGGCAAGGGAAACAATGGCTGACTG 

GAGAGCAAAGAAGGGGAAGCTCAGGAGAAAATTTTAGAAAGCCGCCAGGACCCTGTAGTTCTAAGATCTACGGGGAAACAGGCACC 

CAACGGCTGCGTCTCAGGTTTCCGCGGGTCACTAAAGAATAACGGACATCCTCCCAACGGTGGCCCTGGGGCTCCGCGGGCGCTTC 
YP-2: -644 (Neg)

CGCCGAGCTCGCGCCGACCCCGCGCTCGGCCCCGCACCCCGCCGGGCGCTCGCGGCGAGATACCGGACGCTGCCCGCGTCGCCCGA 

-500 

TTTTGTCCGTTCGGTCCTCCACACTCACCCCGCGGCCATCGCTCGCCCGAAGCCAGGCGACAAGAACAAACACCTCCCGACGCGAA 

YP-3: -409 (Neg)

AAAGGAAGCACAGGCGATTCTCGTCAAAGCAGACTTTATTGGGGCGACAGGGCCGCCCCGCACGCGCCAGCCGCTCCCCGCGCGGC 
CCCCGGGCCGCCCACCCGCCTCAACCCCGCTCCCGGCCGGCCCCTCCCTCCCTTCTCCTCAGGCTCCCGCCCCCGTGGTGCCCGGG 

GCCGCGCGGACCGCTCACCGGCTCCCAAGGCAGCGGCTGTAGCGGCGACGCCCCGTTCCCGAGTGCGGCCCCGGCCCGAGGCGGCG 

GGTTTTGTGGCTGTTGCACCGCGAAGGGCGGCAGCCGCGCGACACCGGGAAGCGGGAGGCGGTGGCGGCGGCGGCGGCGCGCTGAC 

YP-4: -105 (Neg)

GTCACGCGCCGCGGGCCAGCCAGGGCGCGTGCGAGCCGCCCCGCCCCCGGTCCCATCGGCCCCAATCCGGGAGGAGCCCGGCGAGT 

-1+1 YU-l:-HCPos) 

GGGCGGGGCCGCGGAGGCCAGCGGACAGATCGATTGGCCGAGAGGAGAATCGAG agggcgaacgggcgagtggcagcgaggcggg 

gcgggctgaggccagcgcggaagtctcgcgaggccgggcccgagcagagtgtggcggcggcggcgagatctgggctcgggttgagg 

YU-2: +133 (Pos) YU-3: +168 (Pos)

agttggtatttgtgtggaaggaggcgga ggcgcagga ggaagggggaagc ggagcgccgg cccggagggcgggaggaggcgcggcc 

YU-4: +:"1 (Pos) 

agggcgggcggttgcggcgaggcgaggcgaggcggggagccgagacgagcagcggccgagcgagcgcgggcgcgggcgcaccgagg 

cgagggaggcggggaagccccgccgccgccgcggcgcccgccccttcccccgccgcccgccccctctccccccgcccgctcgccgc 

cttcctccctctgccttccttccccacggccggccgcctcctcgcccgcccgcccgcagccgaggagccgaggccgccgcggccgt 
+481

ggcggcggagccctcagcc ATG 

Figure 1. Schematic primary structure and sequence of the YY1 promoter and 5'-UTR. (A) Primary structure and the G/C contents of the YY1
promoter and 5'-UTR. The percentages of G/C contents in the 1500-bp YY1 promoter and the 5'-UTR, as well as the G content in the 5'-UTR, are
indicated. The transcription start site is designated as '+1'. (B) The DNA sequence of the YY1 promoter (1000 bp) and 5'-UTR. The G4 DNA
candidates in the YY1 promoter, on either the positive or negative strand, are in blue text, while the G4 RNA candidates in the YY1 5'-UTR are in red
text. The promoter is shown in capital letters, while the 5'-UTR is in lower case. YP-1, -2, -3 and -4 in the YY1 Promoter (YP) and YU-1, -2, -3 and
-4 in the YY1 5'-UTR (YU) indicate the positions of the oligonucleotides shown in Figure 2A. The numbers in green text indicate the positions of
the first nucleotides in these candidate G4 structure-forming sequences, which are identified by the dots beneath them. The underlined sequence in the
YY1 5'-UTR is the overlapping region of YU-2 and YU-3. Neg: negative strand; Pos: positive strand.



content from -500 to -1 is >76.8%. The G/C content in the YY1 5'-UTR
is 80.0%, with a G content of 44.4%, markedly higher than the 25%
expected if the four nucleotides were equally represented. These
analyses indicate that YY1 has strong potential to contain G4 motifs
in its promoter and 5'-UTR regions.
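
The windowed G/C analysis described here amounts to counting bases in the region that ends at -1. A minimal sketch follows; the function names are hypothetical and the short sequence is a toy placeholder, not the YY1 promoter.

```python
def gc_content(sequence):
    """Percentage of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    return 100.0 * (sequence.count("G") + sequence.count("C")) / len(sequence)


def g_content(sequence):
    """Percentage of G bases alone (relevant to G4 RNA in a 5'-UTR)."""
    sequence = sequence.upper()
    return 100.0 * sequence.count("G") / len(sequence)


def upstream_window(promoter, length):
    """Last `length` bases of a promoter whose final base is position -1."""
    return promoter[-length:]


# Toy sequence for illustration: A/T-rich far upstream, G/C-rich near -1.
toy_promoter = "ATATGCGCGG"
for window in (10, 6):
    region = upstream_window(toy_promoter, window)
    print(f"-{window} to -1: {gc_content(region):.1f}% G/C")
```

Applying the same functions to nested windows such as -1500, -1000 and -500 to -1 reproduces the kind of comparison summarized in Figure 1A.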

Since the G/C content from -1000 to -1 in the YY1 promoter is >67.3%,
we chose to analyze this region for potential G4 DNA
structure-forming sequences based on the algorithm proposed in
previous literature (66). As a result, we identified four elements
(YP-1 to YP-4; YP: YY1 Promoter) that may form G4 DNA structures on
either the positive strand (YP-1) or the negative strand (YP-2, -3
and -4) of the YY1 promoter (blue text in Figure 1B). The analyses of
the YY1 5'-UTR also revealed four candidate sequences (YU-1 to YU-4;
YU: YY1 5'-UTR) for G4 RNA structure formation (red text in
Figure 1B).
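
A strand-aware motif search of this kind can be sketched with a regular expression. The text does not spell out the algorithm of reference (66), so the pattern below is an assumption: it is the widely used consensus of four runs of at least three guanines separated by loops of 1-7 nucleotides. Function names are hypothetical and the test sequences are toy examples.

```python
import re

# Assumed consensus: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence (A, C, G and T only)."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def find_g4_candidates(sequence):
    """Return (strand, 0-based start on the positive strand, motif) tuples.

    The negative strand is searched via the reverse complement, and hits
    are mapped back to positive-strand coordinates.
    """
    sequence = sequence.upper()
    hits = [("Pos", m.start(), m.group()) for m in G4_PATTERN.finditer(sequence)]
    for m in G4_PATTERN.finditer(reverse_complement(sequence)):
        hits.append(("Neg", len(sequence) - m.end(), m.group()))
    return hits


# Toy sequences: a G-rich positive strand and its C-rich counterpart.
print(find_g4_candidates("TTGGGAGGGTGGGAGGGTT"))
print(find_g4_candidates("AACCCTCCCACCCTCCCAA"))
```

Searching the reverse complement is what allows C-rich stretches on the positive strand to be reported as negative-strand candidates, as with YP-2, -3 and -4. Note that `finditer` returns non-overlapping matches, so overlapping candidates would need a lookahead-based variant.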



Analyses of oligodeoxyribonucleotides derived from the YY1 promoter
and 5'-UTR by CD spectroscopy indicate the formation of G4 structures

To examine G4 DNA or RNA structure formation by the candidate
sequences in the YY1 promoter and 5'-UTR shown in Figure 1B, we
designed oligodeoxyribonucleotides based on either their original or
complementary sequences (Figure 2A). For the candidates in the YY1
5'-UTR, we used their sequences to make oligodeoxyribonucleotides for
this primary screening analysis, based on the structural similarity
between G4 DNA and G4 RNA, and then confirmed the presence of G4 RNA
structure using the oligoribonucleotides. We individually annealed
these oligonucleotides as described in Supplementary Figure S1. As a
control, we also synthesized a sequence located in the human c-Myc
promoter that forms a well-characterized G4 DNA structure (c-Myc G4
DNA, Figure 2A) (67). These annealed samples were then analyzed by CD
spectroscopy.

In all CD studies, the annealed c-Myc G4 oligodeoxyribonucleotide
displayed peaks of positive molar ellipticity at 262 nm and negative
molar ellipticity at 240 nm, which is characteristic of a parallel G4
structure (68). Oligodeoxyribonucleotides YP-3 and YU-4 displayed
peaks of positive molar ellipticity at 262 nm (closed triangles,
Figure 2B), indicative of the presence of a parallel G4 structure
under this in vitro condition. However, a second peak of positive
molar ellipticity at 295 nm was also observed with both
oligodeoxyribonucleotides, suggesting the presence of a G4 structure
with anti-parallel strands. To confirm that these signature peaks
indicate G4 structure, we synthesized two control
oligodeoxyribonucleotides, YP-3M and YU-4M, in which the guanines
with predicted roles in G4 structure formation of YP-3 and YU-4 were
replaced by adenines (Figure 2A). These mutated oligonucleotides
exhibited altered CD spectra with a single peak not coincident with
either the 262 or 295 nm peaks observed with the wild-type
oligodeoxyribonucleotides (open triangles, Figure 2B). These results
indicate that the oligodeoxyribonucleotides YP-3 and YU-4 form G4
structures with a mixture of parallel and anti-parallel strands.

The spectra of the other three oligodeoxyribonucleotides, YP-1, YP-2
and YP-4, whose sequences are found in the YY1 promoter, gave no
indication of G4 structure (Figure 2B). Among the other three
oligodeoxyribonucleotides whose sequences are located in the YY1
5'-UTR, the CD spectra of YU-1 and YU-2, but not YU-3, were similar
to that of YU-4, although the peak heights of the YU-2 spectrum were
substantially diminished (Figure 2B), suggesting that these two
oligodeoxyribonucleotides can also form G4 structures with a mixture
of parallel and anti-parallel strands.

DMS footprinting assays of YY1 G4 DNA structures

The YP-3 oligonucleotide contains 7-9 separate G-runs (Figure 2A). To
determine whether YP-3 forms intramolecular or intermolecular
structures, we carried out a stoichiometry study. The labeled and
unlabeled YP-3 and YP-3-10A (YP-3 with 10 adenines added to its
3'-end) were mixed in different combinations and annealed in the
presence of 50 mM KCl, followed by analysis on a 20% non-denaturing
polyacrylamide gel. When YP-3

§ The original sequences in the YY1 promoter whose potential G4 structures are present in the negative strands.

Figure 2. CD and DMS footprinting analyses to determine the G4 structure of the oligodeoxyribonucleotides. (A) The sequences of the oligonucleotides
for the candidate G4 structure-forming sequences located in the YY1 promoter and 5'-UTR. The oligonucleotide sequences (asterisk), their original
sequences (section symbol) when the potential G4 structures are present in the negative strand (for YP-2, -3 and -4) and their positions are indicated.
The guanines that likely contribute to the G4 structure formation are underlined. YP-3M and YU-4M contain the mutated nucleotides (underlined
and in reversed case relative to the other bases in the same sequences) that are predicted to disrupt the potential G4 structures in YP-3 and YU-4,
respectively. The sequences of c-Myc G4 DNA (67), Zic-1 G4 DNA (73) and rAGA G4 RNA (50) have been previously reported. (B) CD analyses of
the oligodeoxyribonucleotides. The red dashed lines pointed to by the closed arrow heads indicate the signature peaks of parallel G4 structures at
262 nm, while the blue dashed lines pointed to by the open arrow heads indicate the peaks of molar ellipticity after the G4 structure-forming
sequences were mutated. (C) DMS footprinting assay to confirm the G4 DNA structure in YP-3. The 5'-³²P-labeled oligodeoxyribonucleotide YP-3
was annealed under the conditions of no added cation (lane 1), in the presence of 100 mM KCl (lane 2) or 100 mM LiCl (lane 3) and then treated
with DMS. The result of formic acid treatment (Maxam-Gilbert G+A reaction) is shown on the left. The dashed lines align the bands with the YP-3
sequence, shown to the right. The markedly protected guanines in lane 2 compared to the other two lanes are marked by arrow heads.




and YP-3-10A were annealed individually, each of them exhibited a
major band (lanes 1 and 2, Supplementary Figure S2). When each of
them was annealed in the presence of the other oligonucleotide,
either unlabeled or labeled, its major band did not shift and no
noticeable extra band was detected (lanes 3-7). The oligonucleotides
without annealing treatment also showed two major bands migrating to
the same positions as their correspondingly annealed samples (lanes 8
and 9), suggesting an instant and favorable transition to the
G4-containing structures. Overall, the results of this stoichiometry
study strongly suggest that the YP-3 oligonucleotide forms
intramolecular, rather than intermolecular, structures.

The guanine N7 in double- and single-stranded DNA is available for
methylation by DMS, while the N7 of the guanines in a G4 structure is
inaccessible due to the formation of Hoogsteen hydrogen bonds. To
confirm the G4 DNA structure in the YY1 promoter and 5'-UTR, we
determined the DMS accessibility of the guanine N7 in the
oligodeoxyribonucleotides YP-1 to YP-4 and YU-1 to YU-4 annealed in
the absence and presence of 100 mM KCl or 100 mM LiCl. Among these
oligodeoxyribonucleotides, YP-3 exhibited the most pronounced changes
in sensitivity to DMS-mediated methylation (Figure 2C and
Supplementary Figure S3). Compared to the conditions of no extra
cation and 100 mM LiCl, most guanines at the 5'-end of YP-3 were
markedly protected in the presence of 100 mM KCl, indicating the
presence of Hoogsteen hydrogen bonds, as would be expected with the
formation of G4 DNA structures. This result is consistent with the CD
study, strongly indicating the presence of G4 DNA structure in the
YY1 promoter. Among the other YP-oligodeoxyribonucleotides, YP-1 and
YP-2, but not YP-4, showed partial protection (Supplementary
Figure S3A). Since the DMS footprinting result for YP-3 was
consistent with its CD analysis, we focused on this region for
further studies. We also observed partial (YU-4) or marginal (YU-1,
-2 and -3) protection of the YU-oligodeoxyribonucleotides from
DMS-mediated methylation (Supplementary Figure S3B). However, as an
example of a G4 structure located in the YY1 5'-UTR, we focused only
on YU-4 for further investigation.

The strong monovalent cation dependence of the formation 
of secondary structure by oligodeoxyribonucleotide YP-3 
and oligoribonucleotide YU-4 confirms the presence 
of G4 DNA structures 

It is known that the formation of G4 structures is stabilized by potassium, but disfavored
by lithium ions (69). Therefore, to confirm that oligodeoxyribonucleotide YP-3 and
oligoribonucleotide YU-4 self-anneal into G4 structures, we determined the effects of the
monovalent cationic environment on the G4 structure formation and thermal stability of the
secondary structure formed by these oligonucleotides. We annealed oligodeoxyribonucleotide
YP-3 and oligoribonucleotide YU-4 in the absence and presence of 50 mM KCl or 50 mM LiCl,
and then carried out CD analyses, again using c-Myc G4 DNA as a control. As shown in the
top panels of Figure 3A and B, both wild-type oligodeoxyribonucleotide YP-3 and
oligoribonucleotide YU-4, similar to the c-Myc G4 DNA, exhibited peaks of positive molar
ellipticity at 262 nm and negative molar ellipticity at 240 nm when annealed in 50 mM KCl.
Both the positive and negative peaks of molar ellipticity were markedly diminished when
YP-3 and YU-4 were annealed in 50 mM LiCl, or in a buffer containing no additional
monovalent cation. The CD spectra of self-annealed oligodeoxyribonucleotide YP-3M and
oligoribonucleotide YU-4M, which contain mutated nucleotides in the G4 sequences, displayed
small peaks of positive and negative molar ellipticity at 262 and 240 nm, respectively, but
these peaks displayed no dependence on the local monovalent cation environment (bottom
panels in Figure 3A and B). These spectra are consistent with the formation of parallel G4
DNA and RNA structures, respectively, by oligodeoxyribonucleotide YP-3 and
oligoribonucleotide YU-4. It is noteworthy that the spectrum of the oligoribonucleotide
YU-4 contains no evidence of an anti-parallel G4 structure. This is expected, because the
3'-endo pucker of ribose in RNA is known to cause a prohibitively high energy cost to the
anti-syn rotation necessary for anti-parallel G4 structure formation (70,71).

Marked monovalent cation dependence of the 
thermal stability of secondary structure by 
oligodeoxyribonucleotide YP-3 and oligoribonucleotide 
YU-4 confirms the presence of G4 structures 

G4 nucleic acids are thermostable structures. Therefore, DNA or RNA sequences with the
potential to form G4 structures typically show high melting temperatures (Tms). To further
confirm the G4 structures in the oligodeoxyribonucleotide YP-3 and oligoribonucleotide
YU-4, we determined the temperature dependence of the positive peak of molar ellipticity at
262 nm for these two oligonucleotides and their mutants annealed in the presence of 50 mM
KCl, 50 mM LiCl, or no additional monovalent cation in the temperature range from 20°C to
94°C. As shown in the top panels of Figure 3C and D, wild-type oligodeoxyribonucleotide
YP-3 and oligoribonucleotide YU-4 clearly showed higher thermal stability when annealed in
the presence of 50 mM KCl, with Tms of 86°C and 81°C, respectively (estimated as the
temperature at which the molar ellipticity had been reduced by 50%), than when annealed in
50 mM LiCl or no additional monovalent cation. Both mutant oligodeoxyribonucleotide YP-3M
and oligoribonucleotide YU-4M exhibited decreased Tms (53°C and 52°C, respectively, when
annealed in the presence of 50 mM KCl) in comparison with the wild-type oligonucleotides,
and reduced dependence of the melting curves on the local cationic environment (bottom
panels of Figure 3C and D). These thermal stability data are consistent with the CD
spectral results shown above in supporting the hypothesis that oligodeoxyribonucleotide
YP-3 self-anneals in the presence of 50 mM KCl into a mixture of parallel and anti-parallel
G4 DNA structures, and that oligoribonucleotide YU-4 self-anneals into a parallel G4 RNA
structure.
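
The Tm criterion used above (the temperature at which the molar ellipticity at 262 nm has
dropped by 50%) can be illustrated with a minimal Python sketch. It assumes that the 50%
reduction is measured relative to the first (lowest-temperature) reading and it uses simple
linear interpolation between neighbouring data points, whereas the curves in Figure 3 were
fitted with polynomials. The function name and the numbers in the usage note are
hypothetical and are not data from this study.

```python
def estimate_tm(temps, ellipticity):
    """Estimate Tm as the temperature where the signal falls to 50% of its
    starting value (assumption: the first reading is the reference).

    Uses linear interpolation between the two points that bracket the
    half-maximal signal. Returns None if the signal never drops that far.
    """
    half = 0.5 * ellipticity[0]
    points = list(zip(temps, ellipticity))
    for (t0, e0), (t1, e1) in zip(points, points[1:]):
        if e0 >= half > e1:
            # Fraction of the interval needed to reach the half-maximal value.
            return t0 + (e0 - half) * (t1 - t0) / (e0 - e1)
    return None
```

For a hypothetical melting series such as `estimate_tm([20, 40, 60, 80, 94], [10, 9.5, 8, 6, 2])`
the half-maximal signal is crossed between 80°C and 94°C; a series that never loses half of
its signal returns `None`, which corresponds to a Tm above the measured range.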

In vitro footprinting of the YY1 promoter region with
DNase I

DNase I preferentially cleaves locally unwound or normal duplex DNA regions versus
single-stranded regions or secondary structures (72). When a supercoiled
pGL3/YY1-short-prmt plasmid was incubated with 100 mM KCl and digested with DNase I, the
primer extension reaction indicated a protected region approximately from -440 to -329 of
the YY1 promoter, including the YP-3 sequence (-409 to -347), versus the condition without
KCl addition (Figure 4, compare the two lanes at right). This result suggested a possible
transition from B-DNA to a G4 structure in the YY1 promoter region, which provided
resistance to DNase I digestion, as previously demonstrated by Sun (60).

Figure 3. CD analyses of self-annealed oligodeoxyribonucleotide YP-3 or oligoribonucleotide YU-4 corresponding to sequences in the YY1
promoter and 5'-UTR, respectively. (A and B) Wavelength scan of (A) oligodeoxyribonucleotide YP-3 and (B) oligoribonucleotide YU-4
annealed in different monovalent cationic environments. (C and D) Thermal stability analyses of the secondary structure formed in self-annealed
oligodeoxyribonucleotides or oligoribonucleotides corresponding to the sequences in the YY1 promoter and 5'-UTR, respectively. The molar
ellipticity at 262 nm of the annealed oligonucleotides in different monovalent cationic environments was observed at different temperatures. The
polynomial fitting curve of each raw data set is shown as a dashed curve with the Tm indicated by a dropped dashed line in the same color.

Figure 4. In vitro footprinting of the YY1 promoter region with DNase
I. DNA sequencing and primer extension reactions were carried out as
described in the 'Materials and Methods' section. The nucleotide
numbers for the region resistant to DNase I digestion (-440 to
-329) and the region of the YP-3 (-409 to -347) are marked on the
left. The YP-3 region and the directions of the YY1 promoter and
DNA sequencing are indicated on the right. The annotation
(C,T,A,G) of DNA sequencing reaction for plasmid
pGL3/YY1-short-prmt is according to the positive strand of the YY1
promoter. The primer extension of the same plasmid was conducted
in the absence (-) and presence (+) of 100 mM KCl.

G4 structure-forming sequences present in the YY1
promoter and 5'-UTR modulate the expression
of a reporter gene

As we observed the presence of sequences with the potential to form G4 structures in the
promoter and 5'-UTR of YY1, we asked whether these structures may affect the expression of
YY1. To answer this question, we first carried out reporter assays. As shown in Figure 5A,
we generated five reporter constructs (see 'Materials and Methods' section for details).
Constructs (a) and (b) employ the wild-type and the G4 structure-mutated YY1 promoters
(-1703 to -1), respectively, to drive Gluc expression. Constructs (c) and (d) have the
wild-type and the G4 structure-mutated YY1 5'-UTR sequences (+1 to +480) inserted between
the PGK promoter and Gluc cDNA of an original reporter vector shown as (e) in Figure 5A.
Reporter assays conducted in 293T cells using the reporter plasmids (a) and (b) of the YY1
promoter demonstrated that mutations of the nucleotides essential to G4 structure formation
resulted in increased YY1 promoter activity (1.5-fold or 54% increase, P = 0.01, Figure
5B), suggesting an inhibitory role of the G4 motif in the YY1 promoter. The reporter assays
using the plasmids (c) and (d) indicated that the mutation of the G4 structure in the YY1
5'-UTR could also promote the expression of the downstream Gluc (Figure 5C). We noticed
that the empty vector (e) exhibited even higher expression than the two constructs (c) and
(d) containing the wild-type or mutated YY1 5'-UTR sequences, respectively. This could
result from either the difference in the distance between the promoter and Gluc cDNA in
these vectors, or the effects of other secondary structures in the inserted DNA fragments.
Therefore, we normalized the Gluc activity of (c) and (d) by that of the vector alone (e)
in Figure 5C. After this calculation, we observed that the G4 structure mutations in the
YY1 5'-UTR led to a markedly increased expression of the downstream Gluc (2.4-fold or 142%
increase, P = 0.001, Figure 5D).
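
The arithmetic behind the normalization to the vector control and behind the paired
"fold" and "% increase" figures can be sketched in Python as follows. This is only an
illustration of the calculation, not the analysis pipeline of this study: the function
names are hypothetical and the readings in the usage note are invented.

```python
def normalize(activity, vector_activity):
    """Express a reporter reading relative to the empty-vector control."""
    return activity / vector_activity


def fold_and_percent(mutant, wild_type):
    """Return (fold change, percent increase) of mutant over wild type."""
    fold = mutant / wild_type
    percent_increase = (fold - 1.0) * 100.0
    return fold, percent_increase
```

With invented readings of 1000 for the vector, 200 for the wild-type construct and 484 for
the mutant construct, `fold_and_percent(normalize(484.0, 1000.0), normalize(200.0, 1000.0))`
gives a ratio of about 2.42, that is, an increase of about 142%. This shows how a fold
change and a percent increase of the kind reported above relate to each other.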



The G4 nucleic acid resolvase G4R1 enhances reporter
gene expression driven by the YY1 promoter but
not the YY1 5'-UTR

Several helicases have been reported to resolve G4 structures (45,47). Among them, G4R1
possesses the capability of resolving both G4 DNA and G4 RNA motifs (50). To explore the
mechanisms underlying the regulation of the G4 structures in the YY1 promoter and 5'-UTR,
we carried out reporter assays to study the effects of G4R1 on the five reporter constructs
described in Figure 5A. When G4R1 was cotransfected with the reporter plasmid (a), we
detected increased Gluc expression compared to the sample transfected with an empty vector
(1.5-fold or 45% increase, P = 0.01, columns 1 and 2 in Figure 6A). The YY1 promoter
containing a mutated G4 sequence retained the response to ectopic G4R1, showing increased
transcriptional activity (1.3-fold or 32% increase, P = 0.003, columns 3 and 4 in Figure
6A). This suggests that one or more concealed G4 motifs may exist in the YY1 promoter in
addition to the one that we have identified.

We carried out experiments to test the effects of G4R1 on the YY1 5'-UTR-mediated
expression. Both reporter constructs (c) and (d), containing the wild-type and G4
structure-mutated YY1 5'-UTR, respectively, could be stimulated by the transfected G4R1
(columns 5-8 in Figure 6B). However, the transcriptional activity of the PGK-Gluc vector
without the YY1 5'-UTR insert could also be enhanced by the ectopic G4R1, indicating that
the PGK promoter is also responsive to G4R1. We therefore normalized the data of columns
5-8 by the corresponding Gluc expression of the vector controls (columns 9 and 10). After
this normalization, the relative expression of the reporter constructs with either the
wild-type or the G4 motif-mutated YY1 5'-UTR displayed very similar activity in the absence
and presence of ectopic G4R1 (9.2% increase, P = 0.49, between columns 11 and 12; 4.2%
decrease, P = 0.33, between columns 13 and 14; Figure 6C). These results indicate that the
G4 sequence motif in the YY1 5'-UTR is unlikely to be a substrate of G4R1. It is noteworthy
that the reporter construct with the mutated G4 structure still exhibited increased Gluc
expression compared to the one with the wild-type sequence (compare columns 13, 14 with 11,
12, Figure 6C), consistent with the observation in Figure 5D.

Figure 5. Reporter assay to study the effects of mutating potential G4 structure-forming sequences in the YY1 promoter and 5'-UTR on gene
expression. (A) Schematic diagrams of the constructs generated for reporter assays. Constructs (a) and (b) employ the 1703 bp YY1 promoter to drive
Gluc expression. Gluc expression in constructs (c), (d) and (e) was driven by the PGK promoter. Wild-type or mutated YY1 5'-UTR is inserted
between the PGK promoter and Gluc cDNA in (c) and (d), respectively. The guanines with the potential to form G4 structures are underlined. The
mutated guanines and the replacing bases are labeled underneath by dots. The introduced SphI and KpnI sites are indicated. Gluc, Gaussia
luciferase; PGK, phosphoglycerate kinase. The mutated sites that altered the G4 structure-forming sequences are indicated as shadowed bars in
the YY1 promoter and 5'-UTR. (B and C) Gluc activity of the reporter constructs containing the wild-type and mutated (B) YY1 promoter and (C)
5'-UTR sequences, respectively. (D) Normalized Gluc activity of the data in 'C'. Asterisk indicates significant changes.

Figure 6. Effects of G4R1 on the expression mediated by the YY1 promoter and 5'-UTR. (A and B) Effects of G4R1 on Gluc expression mediated
by (A) the YY1 promoter and (B) YY1 5'-UTR. In 'A', G4R1-expressing plasmid or the empty vector (500 ng) was cotransfected with either the
reporter construct (a) containing wild-type YY1 promoter sequence, or (b) with mutated YY1 promoter sequence (200 ng, Figure 5A) in a 24-well
plate. In 'B', G4R1-expressing plasmid or the empty vector (500 ng) was individually cotransfected with the reporter constructs (c), (d) and (e)
(200 ng). The ectopically expressed G4R1, with β-actin as a loading control, was examined by western blots shown under the graphs. (C) Normalized
Gluc activity of 'B'. Asterisk indicates significant changes.

In vitro determination of the interaction and resolving
activity of G4R1 on the YY1 G4 structures

As we have demonstrated the stimulatory effect of G4R1 on the YY1 promoter, we further
studied the binding affinity of G4R1 to the G4 DNA sequence in the YY1 promoter in vitro.
We carried out the electrophoretic mobility shift assay (EMSA) using recombinant G4R1
protein purified as previously described (55) and the 32P-labeled oligodeoxyribonucleotide
YP-3 or its mutated form YP-3M (Figure 2A). When 32P-labeled YP-3 was incubated with
increasing amounts of G4R1, slowly migrating bands with generally escalating intensity
(compare lane 1 with lanes 2-6 in Figure 7A) were observed, which are likely the complex
formed by G4R1 and oligodeoxyribonucleotide YP-3. However, YP-3M, which contains mutated
nucleotides in the G4 structure sequence, failed to show any complex formation under the
same conditions (lanes 7-12 in Figure 7A). In this study, we used a unimolecular G4
structure formed by self-annealing of an oligodeoxyribonucleotide whose sequence
corresponds to the Zic-1 G4 DNA flanked by 10 adenines at each end (PolyA Zic-1 G4 DNA, see
Figure 2A) (73) as a positive control (lane 13, Figure 7A), as we observed that PolyA Zic-1
G4 DNA efficiently associates with G4R1 (52). These results indicate that G4R1 can
specifically bind to the G4 motif in the YY1 promoter. It is noteworthy that the free
32P-labeled YP-3 migrated faster than the free 32P-labeled YP-3M, suggesting that the G4
motif made oligodeoxyribonucleotide YP-3 more compact and therefore faster migrating than
the YP-3M with disrupted G4 structure.

Figure 7. EMSA to study the interaction and resolvase activity of G4R1 on the YY1 G4 structures. (A) EMSA study to determine the interaction
between G4R1 and oligodeoxyribonucleotide YP-3. An amount of 1 pM of 5'-32P-labeled self-annealed YP-3 (lanes 1-6) and YP-3M (lanes 7-12)
was incubated with increasing amounts of purified G4R1 (10-120 pM), as indicated on the top. An amount of 1 pM of 5'-32P-labeled self-annealed
PolyA Zic-1 oligodeoxyribonucleotide was used as a positive control. The positions of G4R1-DNA complexes and the unbound 5'-32P-labeled
oligodeoxyribonucleotides (or probes) are denoted on the left. (B) EMSA study to detect the interaction between G4R1 and oligoribonucleotide
YU-4. As in 'A', 5'-32P-labeled self-annealed YU-4 (lanes 1-7) and YU-4M (lanes 8-14) were incubated with increasing amounts of purified G4R1
as indicated. An amount of 1 pM of 5'-32P-labeled self-annealed rAGA was used as a positive control. (C) Gel mobility shift assay to detect
ATP-dependency of G4R1/YP-3 association. An amount of 1 pM of 5'-32P-labeled self-annealed YP-3 was incubated with increasing amounts (0, 25,
75 and 300 pM) of G4R1 in the presence and absence of 5 mM ATP. The samples were resolved on a 10% non-denaturing polyacrylamide gel. The
G4R1/YP-3 complexes and free YP-3 oligodeoxyribonucleotide are indicated.

We also assessed the binding affinity of G4R1 to the G4 structure of the oligoribonucleotide
whose sequence is found in the YY1 5'-UTR. While the incubation of G4R1 with rAGA (Figure
2A), a G4 RNA that was previously demonstrated to associate with G4R1 (50), caused the
appearance of a slowly migrating band (lane 15, Figure 7B), neither oligoribonucleotide
YU-4 nor YU-4M showed any interaction with G4R1 (lanes 1-14).

To determine whether G4R1 resolves or stabilizes the G4 DNA structure in the YY1 promoter,
we studied the G4R1 and YP-3 association in the presence and absence of ATP. In the
presence of ATP, which is required for the resolvase activity of G4R1, we detected both the
G4R1/YP-3 complex and free YP-3 in comparison to the control with only annealed YP-3
(Figure 7C, lanes 2-4 versus lane 1). However, in the absence of ATP, the G4R1/YP-3
association was markedly increased, while the detected free YP-3 was largely reduced (lanes
5-7). This result suggests that G4R1 binds YP-3 tightly in the absence of ATP, but resolves
and then releases YP-3 when ATP is provided. The binding, resolving and refolding are rapid
and dynamic. Therefore, the free YP-3 could very quickly transform back to the favored
compact G4 structures after being resolved by G4R1, which would explain why the slowly
migrating, unstructured YP-3 was not observed in lanes 2-4 of Figure 7C.

G4R1 binds the YY1 promoter and its manipulated
expression affects endogenous YY1 levels

As we observed the association of G4R1 and oligodeoxyribonucleotide YP-3 in vitro, we asked
whether G4R1 binds the YY1 promoter in cells. Therefore, we carried out ChIP assays with
the anti-G4R1 antibody, using non-specific IgG (sc-2343) as a negative control and histone
H3 antibody as a positive control. As shown in the top panel of Figure 8A, G4R1 exhibited
~5-fold higher binding affinity than the control sample (P = 0.004), suggesting that G4R1
directly regulates YY1 gene expression. As a positive control, the histone H3 antibody
showed a stronger signal than the G4R1 antibody. To exclude non-specific binding of the
G4R1 antibody, we also amplified a region in the YY1 exon 5 that has relatively low G/C
content (17.3% of either G or C). As presented in the bottom panel of Figure 8A, only the
histone H3 antibody showed high binding affinity to this region, while the G4R1 antibody
exhibited a signal comparable to that of the negative control antibody.

Figure 8. ChIP assay of G4R1 affinity to the YY1 promoter and effects of altered G4R1 expression on endogenous YY1 levels. (A) ChIP assay of
the YY1 promoter. Non-specific mouse IgG (control), G4R1 antibody and histone H3 antibody were used in the immunoprecipitation. The amplified
regions aligned to the YY1 promoter and mRNA (with transcription starting site designated as '+1') are indicated. The amounts of the YY1 DNA
precipitated by these antibodies relative to the input are presented. The results are derived from three separate experiments. Asterisk indicates
significant changes. (B) Effect of ectopic G4R1 on endogenous YY1 expression in 293T cells. Increasing amounts of G4R1, as shown above the blot,
were transfected into 293T cells that express relatively low levels of endogenous G4R1. The cell lysates were analyzed by western blots using the
antibodies indicated on the left. (C) Effects of G4R1 knockdown on endogenous YY1 expression in HeLa cells. HeLa cells that express high levels of
endogenous G4R1 were infected by lentivirus expressing an inducible shRNA against G4R1 (54) and cultured in medium containing 1.5 µg/ml of
doxycycline (Dox) to induce the shRNA expression. The cells were collected at different time points as indicated on the top and the cell lysates
were analyzed by western blots using the antibodies indicated on the left.

Since G4R1 binds to the YY1 promoter and stimulates its activity in driving Gluc expression
in the reporter assays, we asked whether G4R1 affects the expression of endogenous YY1. We
first transiently transfected increasing amounts of G4R1 into 293T cells, as they express
relatively low levels of G4R1, and determined the endogenous YY1 levels by western blot. As
shown in Figure 8B, YY1 expression was elevated with increasingly expressed G4R1. To
determine whether G4R1 is required in maintaining endogenous YY1 expression, we inducibly
knocked down G4R1 in HeLa cells that express high levels of G4R1. As shown in Figure 8C,
the depletion of G4R1 only slightly reduced the levels of endogenous YY1, suggesting that
G4R1 may not play a crucial role in maintaining YY1 expression.

G4R1 expression generally correlates with YY1
expression in breast cancer samples

As ectopic G4R1 affects endogenous YY1, we proceeded to determine whether there is any
correlation between the expression levels of these two proteins. First, we took breast
cancer as an example to test G4R1 and YY1 expression in some commonly used cell lines, with
normal human mammary epithelial cells (HMEC) as controls. We also included MCF-10A cells
that are non-tumorigenic but immortalized. The tumorigenic cell lines included HEK (HMEC
immortalized by SV40 large-T antigen, the telomerase catalytic subunit and an H-Ras) (74),
SK-BR-3, ZR-75-1, BT-474 and MDA-MB-231. We analyzed an equal amount of cell lysates from
these breast cell lines by western blot using antibodies for G4R1, YY1 (H-10) and β-actin.
As shown in Figure 9A, non-tumorigenic MCF-10A cells and all tumor cell lines, except
BT-474, showed elevated G4R1 expression compared to HMEC, while YY1 levels were markedly
increased in all tumor cell lines compared to MCF-10A and HMEC. These results suggest that
both G4R1 and YY1 are overexpressed in most breast cancer cells. To determine whether G4R1
expression correlates with YY1 levels in primary human breast cancer, we analyzed a set of
Affymetrix microarray expression profiles derived from the Uppsala breast cancer cohort
consisting of 258 patient samples (64) using the probes indicated in the 'Materials and
Methods' section. As shown in Figure 9B, the gene expression patterns of G4R1 and YY1 in
these 258 breast tumors exhibited a significant correlation (P = 4.5 x 10~^, Pearson
correlation), which is consistent with the results obtained from our in vitro studies.

Figure 9. Studies of G4R1 and YY1 expression in breast cancer cell
lines and patient samples. (A) G4R1 and YY1 expression in different
non-malignant and malignant breast cell lines. Total cell lysates of the
cell lines (labeled on the top) were analyzed by western blot (antibodies
labeled on the left). HMEC and MCF-10A were used as non-malignant
controls. HEK cells are a breast cancer cell line immortalized by SV40
large-T antigen (74). (B) Analysis of the expression correlation between
G4R1 and YY1 in breast cancer samples in the Uppsala breast cancer
cohort consisting of 258 patient samples. The signal intensities of G4R1
and YY1 were logarithmically transformed. A significant, positive
correlation is suggested by the Pearson Product Moment Correlation
P-value of 4.5 x 10^
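
The correlation analysis described above (Pearson product moment correlation on
logarithmically transformed signal intensities) can be sketched in Python as follows. The
base of the logarithm is not stated in the text, so base 2 is an assumption here; the
Pearson coefficient itself does not depend on that choice. The function names are
hypothetical, no patient data are reproduced, and the significance test that yields the
P-value is not included.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_pearson(intensity_a, intensity_b):
    """Correlate two probe-intensity series after a log2 transform (assumed base)."""
    log_a = [math.log2(v) for v in intensity_a]
    log_b = [math.log2(v) for v in intensity_b]
    return pearson_r(log_a, log_b)
```

A positive coefficient close to 1 indicates that samples with higher signal for one gene
tend to have higher signal for the other, which is the kind of relationship reported for
G4R1 and YY1 in Figure 9B.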



DISCUSSION 

The regulatory activities of YY1 in different epigenetic processes have implicated its
critical role in cell proliferation, differentiation and tumorigenesis. As a
multifunctional transcription factor, YY1 has been extensively studied with respect to its
regulation of the expression of various target genes. However, the mechanisms underlying
how YY1 gene expression is regulated are relatively understudied. In this report, we
revealed the G/C-rich features in the vicinity of the YY1 transcription start site on the
human YY1 promoter and 5'-UTR. We then provided clear in vitro evidence to demonstrate that
these regions contain sequences capable of forming G4 DNA and RNA structures, respectively,
and that mutations of these G4 structure-forming sequences affected the YY1
promoter-mediated Gluc expression or that downstream of the YY1 5'-UTR in reporter assays.
Our mechanistic studies also suggested that the G4 nucleic acid resolvase G4R1 may release
the G4 structure in the YY1 promoter, but not the one in the YY1 5'-UTR. To investigate the
biological relevance of our study, we also analyzed the gene array of 258 human breast
cancer samples and discovered a significant correlation between G4R1 and YY1 gene
expression.

In the promoter region, G4 DNA structures that regulate gene expression may exist in either
the positive or negative strand. It is noteworthy that cytosine-rich (C-rich) DNA sequences
may form i-motifs that have been indicated to regulate the expression of multiple genes,
including Bcl-2 and c-Myc (75,76). Therefore, it is reasonable to predict that, while G4
motifs are present in one strand, structures such as i-motifs can also be formed in the
Watson-Crick complementary strand. They may collaboratively regulate the transcriptional
activity of the YY1 promoter, as described in a model for the c-Myc promoter (40).

The YY1 promoter likely forms multiple G4 structures due to its high G/C content and it is
not practical for us to determine all of them in this study. However, our results strongly
indicated the presence and regulatory role of the G4 structure in the YY1 promoter. In the
YP-3 oligonucleotide, there are 7-9 isolated G-runs (>3 Gs) and our stoichiometry
experiment suggested that the majority of this oligonucleotide formed intramolecular
structures. Therefore, we predict that the YP-3 oligonucleotide forms multiple, dynamic G4
structures among these G-runs that may rapidly interchange, leading to the protection of
multiple guanines in the DMS footprinting assay (Figure 2C). Consistently, our CD studies
also suggested the presence of G4 structures with anti-parallel strands in the YP-3 (Figure
2B). Dynamic G4 structure formation has been demonstrated previously (77,78); it makes it
technically challenging to create a schematic picture of the predicted G4 structures, which
would be beyond the scope of this current study.
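
Counting isolated G-runs in an oligonucleotide, as discussed above for YP-3, can be
sketched with a short regular-expression scan in Python. The function name is hypothetical,
and the minimum run length is left as a parameter; the default of three consecutive
guanines is an assumption based on a common convention in G4 prediction, not a value taken
from this study. The sequence in the usage note is an invented toy example, not the YP-3
sequence.

```python
import re


def find_g_runs(seq, min_len=3):
    """Return (start index, run) for every maximal run of at least min_len guanines.

    Indices are 0-based; the sequence is upper-cased before scanning.
    """
    pattern = re.compile("G{%d,}" % min_len)
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]
```

For the toy sequence `"TGGGAGGGTGGGGAAGG"`, `find_g_runs` reports three runs and ignores the
final `GG`, which is shorter than the threshold. An oligonucleotide with many more than
four such runs can, in principle, fold into several alternative intramolecular G4
arrangements, which is the situation proposed for YP-3.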

We demonstrated an interaction of G4R1 with the G4 sequence motif in the YY1 promoter, as
well as a stimulatory effect of G4R1 on the expression driven by the YY1 promoter. However,
we also observed that G4R1 could still enhance the expression of the reporter construct
driven by the G4 structure-mutated YY1 promoter, as shown in Figure 6A. Since the region
close to the transcription start site in the YY1 promoter is highly G/C rich, this effect
could result from other G4 DNA structure(s) that have not been identified. It is also
possible that the activity of the YY1 promoter is inhibited by other types of structures,
which can be resolved by G4R1.

In the EMSA studies, we did not detect large amounts of unbound oligonucleotides (Figure
7), since only 1 pM of 5'-32P-labeled self-annealed YP-3 oligonucleotide was incubated with
10-300 pM of G4R1. We used this ratio because these concentrations represent standard
conditions for determining binding affinity. When determining the binding affinity of an
annealed oligonucleotide, it is important that the oligonucleotide is used at a much lower
concentration than that of the resolvase. Given the extraordinarily tight binding affinity
of G4R1 for G4 nucleic acids (52), the concentration of the G4 oligonucleotides should be
kept close to 1 pM. As shown in Figure 7C, G4R1 tightly bound YP-3 in the absence of ATP.
This binding interaction at low YP-3 concentrations is strongly indicative of the presence
of G4 structures in YP-3, as G4R1 prefers to bind G4 nucleic acids over unstructured
single-stranded DNA or Watson-Crick duplex DNA by at least 3 orders of magnitude (50,55).
YP-3 was released from G4R1 after resolution of the G4 structure when ATP was added (52).
This result strongly suggests



Nucleic Acids Research, 2012, Vol. 40, No. 3 1047 



that G4R1 resolves the G4 structure in the YP-3
oligonucleotide.

When testing the effects of the YY1 5'-UTR on the expression of downstream Gluc, we
employed the PGK promoter. In our study, we observed that this promoter can be stimulated
by G4R1, suggesting that it may contain G4 DNA structure motif(s). It is certainly possible
that G4R1 possesses other activities, in addition to resolving G4 structures, which may
generally promote gene expression. After examining the DNA sequence, we found that this
promoter is indeed G/C rich (>63%) and has multiple G- or C-runs. We also checked several
other commonly used promoters with medium expression strengths, including the chicken
β-actin and ubiquitin C promoters. They both have high G/C contents and potential G4
DNA-forming sequences. The CMV promoter has nearly equal G/C and A/T content, but its
robust expression strength could overwhelm the effect of G4R1 on any putative G4 structure.
Therefore, we still used the PGK promoter to study the YY1 5'-UTR-mediated expression and
compensated for the effects of G4R1 by normalizing the data against those of the reporter
construct without the YY1 5'-UTR insert. After this data processing, we concluded that the
YY1 5'-UTR-mediated Gluc expression was unresponsive to the ectopically introduced G4R1.
Consistently, G4R1 did not bind to oligoribonucleotide YU-4 in our EMSA study (Figure 7B).
Whether the G4 RNA motif in the YY1 5'-UTR is a substrate of other resolvases, such as BLM,
WRN, or FANCJ, needs further investigation. It is likely that multiple G4 structures exist
in the 480-nt YY1 5'-UTR with such a high G content (44.4%), a speculation that is
supported by the CD studies of the oligonucleotides YU-1 and YU-2 (Figure 2B). However, we
believe that G4R1 does not bind these two oligonucleotides, since the expression of the
reporter construct containing the entire 5'-UTR region of YY1 was not affected by ectopic
G4R1.
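
Base-composition figures of the kind quoted above (a G/C content above 63% for the PGK
promoter, a G content of 44.4% for the 5'-UTR) come from a simple count over the sequence.
A minimal Python sketch, with a hypothetical function name and without reproducing any
sequence from this study, is:

```python
def base_fraction(seq, bases):
    """Fraction of positions in seq occupied by any of the given bases.

    Example: bases="GC" gives G/C content, bases="G" gives G content.
    """
    seq = seq.upper()
    return sum(seq.count(b) for b in bases) / len(seq)
```

Calling `base_fraction(sequence, "GC")` on a promoter sequence and `base_fraction(sequence, "G")`
on a 5'-UTR sequence would yield values comparable to the percentages cited in the text
once multiplied by 100.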

While G4R1 may possess general stimulatory effects on gene expression, especially on
oncogenes that have G4 structures in their promoters, our data from the CD spectral study,
reporter assay and correlated G4R1-YY1 gene expression strongly suggest that YY1 is one of
the G4R1 target genes and that this regulation may contribute to cancer development. It is
noteworthy that G4R1 knockdown did not markedly affect endogenous YY1 levels, although
ectopically expressed G4R1 led to increased YY1 expression (Figure 8B and C). This
indicates that YY1 expression is modulated by multiple regulatory mechanisms, in addition
to G4 DNA/G4R1. Based on our data, G4R1 likely plays a role in promoting, but not
maintaining, YY1 expression.

The sequences of the G4 structures may contain or overlap with the binding sites of certain
transcription factors. Thus, it is possible that the mutagenesis of the guanines essential
to G4 structure formation in the YY1 promoter could alter the affinity of certain
transcription factor(s), likely repressor(s), and in turn enhance the promoter activity in
reporter assays. This dilemma is a generic issue in the field of G4 structure research. In
our studies, in addition to the data from reporter assays that demonstrated the effects of
the G4 DNA-forming sequence and G4R1 on the YY1 promoter-mediated gene expression, we also
presented spectroscopic evidence from CD analyses and protection of the YY1 G4 DNA
structure from DMS-mediated methylation and DNase I digestion. All data sets support the
presence and the regulatory role of G4 structure in the YY1 promoter.

Numerous reports demonstrated the potential regulation of YY1 in tumorigenesis. Although
most studies suggest an oncogenic or proliferative role of YY1 in cancer development and
progression [see the review (3)], a handful of reports also proposed possible anti-cancer
activities of YY1 (10,79-81). Statistical analysis indicates that G4 structures are present
in the promoters of most proto-oncogenes, while the promoters of tumor suppressors have
very low levels of guanine runs (43). Therefore, our current study showing the G/C-rich
feature and the presence of a quadruplex in the YY1 promoter supports the theory of YY1 as
an oncogene.

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Figures S1-S3.